Machine learning analysis of databases

ABSTRACT

Systems and methods include one or more processors, and a memory storing instructions that, when executed by the one or more processors, in conjunction with a particular machine learning model for a subset of the instructions, cause the system to perform automatically obtaining data of entities from databases based on a frequency at which the data changes, storing the obtained data in a repository, and using the particular machine learning model, performing analysis within the databases.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/824,852, filed on Nov. 28, 2017, which claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application Ser. No. 62/438,185 filed Dec. 22, 2016, the content of which is incorporated by reference in its entirety into the present disclosure.

TECHNICAL FIELD

This disclosure relates to approaches for machine learning analysis of databases.

BACKGROUND

There are various public and private benefit systems in society which are susceptible to misuse. Examples of such benefit systems include healthcare, public housing, food assistance, social security, senior services and community services. Certain commercial programs, such as medical and dental insurance programs, as well as auto, home and life insurance programs, are also subject to misuse. Among these benefit systems, the healthcare system, both public and private, is one of the most frequent targets for misuse, which results in substantial financial loss and potentially substance abuse.

Under conventional approaches, a database may store information relating to claims made for payment (e.g., medical procedure claims, medical equipment claims, prescription claims, doctor office claims, other benefit claims, etc.). Reviewing the claims to identify potential misuse of the benefit system (e.g., fraudulent claims, prescription fraud, healthcare abuse/waste, etc.) may be time consuming and very difficult. The amount of time required and the difficulty of detecting potential frauds may lead to inaccurate and/or incomplete misuse detection.

SUMMARY

Various embodiments of the present disclosure may include systems, methods, and non-transitory computer readable media configured to automatically detect misuse of a benefit system. A database of claims may be analyzed to determine a healthcare metric. The healthcare metric may be compared to a healthcare threshold. Based on the comparison of the healthcare metric to the healthcare threshold, a first lead for investigation may be generated.

In some embodiments, the healthcare metric may characterize a relationship between one or more pharmacy events and one or more clinical events. In some embodiments, the healthcare metric may characterize an amount of opiate doses received by a patient over a period of time. In some embodiments, the healthcare metric may characterize a billing pattern of one or more healthcare providers. In some embodiments, the healthcare metric may be determined using mutual entropy.

In some embodiments, the first lead may identify one or more of patients, healthcare providers, and/or healthcare events.

In some embodiments, the systems, methods, and non-transitory computer readable media are further configured to generate a second lead for investigation based on the first lead. In some embodiments, the second lead may identify one or more of patients, healthcare providers, and/or healthcare events.

These and other features of the systems, methods, and non-transitory computer readable media disclosed herein, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for purposes of illustration and description only and are not intended as a definition of the limits of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Certain features of various embodiments of the present technology are set forth with particularity in the appended claims. A better understanding of the features and advantages of the technology will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

FIG. 1 illustrates an example environment for automatically detecting misuse of a benefit system, in accordance with various embodiments.

FIG. 2 illustrates an exemplary process for generating leads based on pharmacy events and clinical events, in accordance with various embodiments.

FIG. 3 illustrates an exemplary process for generating leads based on opiate doses, in accordance with various embodiments.

FIG. 4 illustrates an exemplary process for generating leads based on billing patterns, in accordance with various embodiments.

FIG. 5 illustrates a flowchart of an example method, in accordance with various embodiments.

FIG. 6 illustrates a block diagram of an example computer system in which any of the embodiments described herein may be implemented.

DETAILED DESCRIPTION

A claimed solution rooted in computer technology overcomes problems specifically arising in the realm of computer technology. In various embodiments, a computing system is configured to access and analyze a database of claims to determine leads for benefit misuse investigations. For example, in various embodiments, a computing system is configured to access and analyze a plurality of claims to determine leads for healthcare fraud investigations. Information relating to healthcare provisions (e.g., patient information, procedure performed, medical equipment used, prescription provided/filled, provider information, etc.) may be parsed from claims data to determine patterns for patients and/or healthcare providers. The patterns may be analyzed to determine specific patterns indicative of fraud/healthcare misuse, and particular patients/healthcare providers/healthcare events may be tagged as a lead for investigation. For example, combinations of pharmacy events and clinical events may be analyzed to determine whether expected pharmacy events/clinical events are occurring. If the expected pharmacy events/clinical events are not occurring, the pharmacy events/the clinical events/patients/healthcare providers may be tagged as potential leads for healthcare misuse. As another example, the amount of opiate doses within filled prescriptions may be tracked to determine whether a patient fits the profile of a drug seeker, and the patient/healthcare provider may be tagged as potential leads. As another example, a doctor's billing pattern may be analyzed to determine the likelihood of billing fraud (e.g., upcoding), and the doctor may be tagged as a potential lead. The patterns may be analyzed using mutual entropy. Detected leads may be used to determine other leads. For example, a lead for a drug seeker may be used to track which doctors have provided prescriptions for the drug seeker, which may be used to determine leads for doctors engaged in prescription misuse.

The number of healthcare claims (e.g., medical procedure claims, hospital claims, medical equipment claims, prescription claims, doctor office claims, etc.) is in the range of the millions or billions per year. Individual healthcare claims may include numerous types of data, such as billing codes (e.g., procedure code, diagnosis code, etc.), patient identifier, location, service provider identifier, service date, and the like. Because databases of medical claims may contain vast amounts of information, selectively mining the available information for useful purposes, such as to identify leads to potential fraudulent claims, is not a trivial task. The present disclosure enables automatic detection of misuse of a healthcare system. The techniques described herein enable automatic tagging of healthcare events, patients, and/or healthcare providers as leads for investigation.

Healthcare waste, fraud and/or abuse may be examples of healthcare misuse. As used herein, fraud refers to a scheme or artifice to defraud any healthcare program or entity or to obtain any of the money or property owned by, or under the custody or control of, any healthcare program or entity. Waste refers to the overutilization of services or other practices that, directly or indirectly, result in unnecessary costs to the healthcare system. Abuse refers to any action that may, directly or indirectly, result in one or more of unnecessary costs to the healthcare system, improper payment for services, payment for services that fail to meet professionally recognized standards of care, and/or services that are medically unnecessary.

While the disclosure is described herein with respect to fraud and fraud lead detection, this is merely for illustrative purposes and is not meant to be limiting. For example, the techniques described herein may apply to waste lead detection and/or abuse lead detection. The techniques described herein may apply to lead detection for misuse of any type of benefit systems which provide for payment/reimbursement to individuals and/or organizations for services performed/received and/or equipment provided/received.

FIG. 1 illustrates an example environment 100 for automatically detecting misuse of a benefit system, in accordance with various embodiments. The example environment 100 may include a computing system 102 and a database 104. The database 104 may include a database and/or a system of databases that receive and store data related to healthcare claims (e.g., medical procedure claims, hospital claims, medical equipment claims, prescription claims, doctor office claims, etc.) submitted by individuals and/or organizations. The data may be organized based on individuals receiving healthcare services/products, individuals/organizations providing healthcare services/products, time (e.g., a particular duration of time), insurance providers, and/or other information. Individuals receiving healthcare services/products may be referred to as patients. Individuals/organizations providing healthcare services/products may be referred to as healthcare providers. Healthcare providers may include facilities, institutions, and/or groups, such as hospitals and clinics, and/or individual practitioners such as doctors, dentists, nurses, pharmacists, and therapists. The database 104 may include supplemental information about the healthcare claims, such as individual/organization contact information, medical code information, and/or other information.

Although the database 104 is shown in FIG. 1 as a single entity, this is merely for ease of reference and is not limiting. The database 104 may represent one or more databases and/or one or more storage devices storing databases located in the same or different locations. The database 104 may be located in the same and/or different locations from the computing system 102. For example, the database 104 may be stored within the memory of the computing system 102 and/or a memory coupled to the computing system 102. The database 104 may exchange information with the computing system 102 via one or more networks.

The computing system 102 may include one or more processors and memory. The processor(s) may be configured to perform various operations by interpreting machine-readable instructions stored in the memory. As shown in FIG. 1, in various embodiments, the computing system 102 may include a claims engine 112, a metric engine 114, a lead engine 116, and/or other engines. The metric engine 114 may include an events engine 122, a doses engine 124, and a billing engine 126. The metric engine 114 may be executed by the processor(s) of the computing system 102 to perform various operations including those described in reference to the events engine 122, the doses engine 124, and the billing engine 126. The environment may include a data store (not shown) that is accessible to the computing system 102. In some embodiments, the data store may include various databases, software packages, and/or other data that are available for download, installation, and/or execution.

In various embodiments, the claims engine 112 may be configured to access and analyze one or more databases of claims. For example, the claims engine 112 may access and analyze healthcare claims stored in the database 104. In various embodiments, the claims engine 112 may parse the information contained within the healthcare claims to identify relevant information for analysis. In some embodiments, the database 104 may include information from healthcare claims which are formatted for access by the claims engine 112. In some embodiments, the database 104 may include healthcare claims and the claims engine 112 may provide one or more of clean-up, enrichment, and/or transformation of the information within the healthcare claims for analysis. For example, the claims engine 112 may identify paid vs unpaid claims, trim unnecessary data, incorporate data from other sources (e.g., patient information, healthcare provider information, etc.) that provides context for the healthcare claims, remove duplicative information within the healthcare claims, and/or perform other transformation to allow the information contained within the healthcare claims to be used for misuse lead detection.

In some embodiments, the claims engine 112 may be configured to access and consolidate information contained in multiple databases of claims. The claims engine 112 may access and extract different information from different databases of claims for analysis. For example, the claims engine 112 may collect claims information from databases of claims from healthcare providers, databases of claims from insurance companies, databases of claims from publicly available information, and/or other databases.

In some embodiments, the claims engine 112 may incorporate the information obtained from the databases of claims and/or other sources (e.g., external sources) into one or more object types defined by one or more ontologies. For example, the claims engine 112 may create from the healthcare claims stored in the database 104 different objects corresponding to different healthcare participants and/or events, such as healthcare provider objects, patient objects, healthcare event objects, service objects, equipment objects, prescription objects, billing objects, and/or other objects. Packaging of information into objects may enable selective access and/or modification of the information contained within the objects. Information packaged within the objects may be accessed and/or modified during misuse lead detection.

In various embodiments, the events engine 122 may be configured to determine one or more healthcare metrics based on the analysis of one or more databases of claims. The healthcare metric determined by the events engine 122 may characterize a relationship between one or more pharmacy events and one or more clinical events. A pharmacy event may refer to a medical event in which a drug is prescribed and/or a prescription for a drug is filled. For example, a pharmacy event may refer to a doctor providing a drug prescription to a patient and/or a pharmacy providing the drug to the patient. A clinical event may refer to a medical event in which a healthcare provider may assess a patient's need for pharmaceutical treatment. For example, a clinical event may include a visit to a doctor's office for a procedure that is typically accompanied by one or more prescriptions for drugs. As another example, a clinical event may include a check-up visit in which a patient's health is examined to determine whether a new prescription is required and/or a previous prescription needs to be reissued/refilled.

The healthcare metric may characterize whether one or more pharmacy events are accompanied by one or more expected clinical events. For example, certain types of drugs (e.g., Schedule II drugs) may require a patient to receive a new prescription to get a refill. Based on the analysis of database(s) of claims, the events engine 122 may determine a healthcare metric that characterizes whether a patient's fillings of drugs are preceded by clinical events in which a healthcare provider would have assessed the patient's need for the drugs and provided the new prescription. For example, for a particular healthcare provider and/or a particular patient, the events engine 122 may count the number of pharmacy events (e.g., a pharmacy fills a drug prescription written by the healthcare provider) and the number of clinical events that may be associated with the pharmacy events. The events engine 122 may count the number of clinical events that occur within a certain period of time before and/or after the pharmacy event.
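As a non-limiting illustration, the counting described above might be sketched as follows. The function name `clinical_coverage`, the 14-day look-back window, and the sample dates are assumptions introduced here for illustration and are not taken from the disclosure; the sketch scores the fraction of pharmacy events preceded by at least one clinical event within the window.

```python
from datetime import date, timedelta


def clinical_coverage(pharmacy_dates, clinical_dates, window_days=30):
    """Fraction of pharmacy events preceded by a clinical event within window_days."""
    if not pharmacy_dates:
        return None  # no pharmacy events, so the metric is undefined
    window = timedelta(days=window_days)
    covered = 0
    for fill in pharmacy_dates:
        # A fill counts as covered if any clinical event falls in [fill - window, fill].
        if any(fill - window <= visit <= fill for visit in clinical_dates):
            covered += 1
    return covered / len(pharmacy_dates)


# Hypothetical data: three fills but only one office visit.
fills = [date(2017, 1, 10), date(2017, 2, 10), date(2017, 3, 10)]
visits = [date(2017, 1, 5)]
coverage = clinical_coverage(fills, visits, window_days=14)
```

In this sketch only the first fill has a visit in its window, so a low value would suggest that refills are occurring without the expected clinical events.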

A healthcare provider whose practice includes an expected number of clinical events associated with pharmacy events may be scored with a satisfactory healthcare metric (e.g., high or low score). A healthcare provider whose practice includes a lower than expected number of clinical events associated with pharmacy events may be scored with an unsatisfactory healthcare metric (e.g., low or high score). A healthcare provider whose practice includes a lower than expected number of clinical events associated with pharmacy events may be practicing a poor standard of care (e.g., bad pain management, such as not meeting with patients to whom the healthcare provider is providing prescriptions). A healthcare provider whose clinical events are not occurring within a certain time duration of the pharmacy events may be practicing a poor standard of care.

A patient who is associated with an expected number of clinical events for pharmacy events may be scored with a satisfactory healthcare metric (e.g., high or low score). A patient who is associated with a lower than expected number of clinical events for pharmacy events may be scored with an unsatisfactory healthcare metric (e.g., low or high score). A patient who is associated with a lower than expected number of clinical events for pharmacy events may have a healthcare provider practicing bad pain management (e.g., not meeting with patients to whom the healthcare provider is providing prescriptions) and/or may be falsifying prescriptions.

In various embodiments, the doses engine 124 may be configured to determine one or more healthcare metrics based on the analysis of one or more databases of claims. The healthcare metric determined by the doses engine 124 may characterize the amount of opiate doses received by a patient over one or more periods of time. The doses engine 124 may leverage information within the database of claims to determine the amount of opiate doses received by a patient. The doses engine 124 may convert the amount of opiate doses received by the patient into a morphine equivalent. For example, different types of drugs may be associated with different levels of morphine, and information about the types and amounts of drugs received by a patient may be converted into a morphine equivalent. The healthcare metric may characterize the amount of opiate doses received by a patient based on the morphine equivalent received by the patient. In some embodiments, the opiate doses may be aggregated on a periodic basis (e.g., daily, weekly, monthly, yearly, etc.). High amounts of morphine equivalent for a patient for a given period of time may indicate that the patient may be a drug seeker. In some embodiments, the healthcare metric for the individual patients may be aggregated to determine the healthcare metric for a healthcare provider. A high score for a healthcare provider may indicate that the healthcare provider potentially provides prescriptions for one or more drug seeking patients.
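As a non-limiting illustration, the conversion and periodic aggregation described above might be sketched as follows. The drug names and conversion factors in `MORPHINE_FACTOR` are hypothetical placeholders rather than clinical values, and the record layout and the function name `monthly_morphine_equivalent` are likewise assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical conversion factors, for illustration only (not clinical values).
MORPHINE_FACTOR = {"drug_a": 1.0, "drug_b": 1.5}


def monthly_morphine_equivalent(fills):
    """Sum morphine equivalent per (patient, year, month).

    Each fill is a tuple (patient_id, fill_date, drug, total_mg).
    """
    totals = defaultdict(float)
    for patient_id, fill_date, drug, total_mg in fills:
        key = (patient_id, fill_date.year, fill_date.month)
        # Convert the dispensed amount to a morphine equivalent before summing.
        totals[key] += total_mg * MORPHINE_FACTOR[drug]
    return dict(totals)


fills = [
    ("p1", date(2017, 1, 3), "drug_a", 300.0),
    ("p1", date(2017, 1, 20), "drug_b", 200.0),
    ("p2", date(2017, 1, 9), "drug_a", 100.0),
]
totals = monthly_morphine_equivalent(fills)
```

The per-patient totals produced this way could then be aggregated per healthcare provider, for example by summing or averaging over the patients for whom a provider wrote prescriptions.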

In some embodiments, the healthcare metric determined by the doses engine 124 may be adjusted based on patient information. For example, the healthcare metric may be adjusted based on the size of the patient so that the variance of amount of opiate doses received by the patient based on the size of the patient is factored into the healthcare metric. As another example, the healthcare metric may be adjusted based on the patient's current health condition (e.g., diagnosed diseases) so that the variance of amount of opiate doses received by the patient due to the patient's health condition is factored into the healthcare metric. For instance, a patient diagnosed with cancer may be expected to receive higher amounts of opiate doses than a patient diagnosed with a cold.

In various embodiments, the billing engine 126 may be configured to determine one or more healthcare metrics based on the analysis of one or more databases of claims. The healthcare metric determined by the billing engine 126 may characterize a billing pattern of one or more healthcare providers. The healthcare metric determined by the billing engine 126 may indicate whether the healthcare providers may be engaged in fraudulent billing practices. For example, the healthcare metric may indicate whether the healthcare providers are engaged in upcoding (e.g., using a more expensive code for payment) and/or other practices to receive money for medical services/equipment that were not provided to patients.

In some embodiments, the billing engine 126 may use mutual entropy to determine whether the healthcare providers are engaging in fraudulent billing practices. The billing engine 126 may use mutual entropy to determine mutual information between the healthcare providers' billings and the patients seen by the healthcare providers. The mutual information may indicate whether the healthcare providers' billings (indicating the medical services performed and/or medical equipment used/provided, etc.) are independent or dependent on the patients seen by the healthcare providers. Mutual entropy may determine whether there is a connection between a particular patient/visit and the treatment provided/billed by the healthcare provider. Mutual entropy may determine whether there is a connection between a patient and the type of treatment received by the patient. Healthcare providers that bill for the same/similar types of treatment regardless of the patient identity/visit may be engaged in fraudulent billing practices. Healthcare providers with billing entries that are tailored to different patients/visits may receive a higher healthcare metric than healthcare providers with billing entries that include same/similar claims across different patients/visits. For example, a doctor who bills for a urine test for every patient may be scored with a lower healthcare metric than a doctor who bills a urine test for a subgroup of patients. A healthcare provider with a low healthcare metric determined based on mutual entropy may be engaged in “cookie-cutter billing,” where the healthcare provider bills for the same/similar treatment regardless of the patient they see and/or the type of visit.
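As a non-limiting illustration, mutual information between patients and billing codes might be estimated from a provider's billing entries as sketched below. The disclosure does not specify an estimator; this sketch assumes the standard plug-in estimate I(P;C) = Σ p(p,c) log2[p(p,c) / (p(p) p(c))] over observed (patient, code) pairs, and the function name and sample entries are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in mutual information, in bits, between patient and billing code.

    pairs is a list of (patient_id, billing_code) billing entries.
    """
    n = len(pairs)
    joint = Counter(pairs)
    patients = Counter(p for p, _ in pairs)
    codes = Counter(c for _, c in pairs)
    mi = 0.0
    for (p, c), count in joint.items():
        # p(p,c) / (p(p) * p(c)) simplifies to count * n / (patients[p] * codes[c]).
        mi += (count / n) * math.log2(count * n / (patients[p] * codes[c]))
    return mi


# Hypothetical providers: one bills the same code for everyone, one varies by patient.
cookie_cutter = [("p1", "urine"), ("p2", "urine"), ("p3", "urine"), ("p4", "urine")]
tailored = [("p1", "urine"), ("p2", "xray"), ("p3", "ecg"), ("p4", "cast")]
mi_cookie = mutual_information(cookie_cutter)
mi_tailored = mutual_information(tailored)
```

Under these assumptions the provider billing an identical code for every patient scores zero, while the provider whose codes vary with the patient scores higher. Plug-in estimates of this kind are biased upward on sparse data, so a practical implementation would likely need larger samples or a correction.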

Calculation of the healthcare metric based on mutual entropy may be grouped by specialty/classes of healthcare providers. For example, a certain specialty (e.g., radiologists) may use a smaller subset of codes for billing than another specialty (e.g., family medicine). Calculating mutual entropy across different specialties may result in healthcare providers in specialties with a smaller number of billing codes having lower healthcare metrics than healthcare providers in specialties with a larger number of billing codes. Grouping the calculation of mutual entropy by specialty/classes of healthcare providers may enable comparison of billing practices among similar types of healthcare providers.

In some embodiments, the billing engine 126 may weigh different parameters for the mutual entropy calculation differently. For example, the billing engine 126 may focus on (weigh more heavily) certain kinds of treatments, equipment, and/or specialties that are more prone to being the subject of fraudulent billing practices. As another example, the billing engine 126 may discard (weigh less heavily) the most common codes used by healthcare providers (e.g., codes expected to be billed for every/most patients). Such codes may be of such high volume that they may skew the mutual entropy calculation and hide misuse of less common codes. Disregarding such codes may allow the mutual entropy calculation (and the healthcare metric) to reflect the misuse of less common codes.

In some embodiments, the billing engine 126 may determine mutual entropy by different time periods (e.g., per visit, per a number of visits, daily, weekly, monthly, yearly, etc.), by the healthcare provider's specialty, and/or by specific codes (e.g., CPT code) for different patients/healthcare providers. For example, determining mutual entropy based on treatments provided to patients over a three month period may provide a different indication of the healthcare provider's billing practices than determining mutual entropy based on treatments provided to patients over a single visit.

In some embodiments, the billing engine 126 may use billing trends to determine whether the healthcare providers are engaging in fraudulent billing practices. The billing engine 126 may analyze the billing entries of healthcare providers to determine if the healthcare providers are increasing the use of more expensive billing codes over time (e.g., the healthcare providers are starting to upcode, etc.). The billing engine 126 may analyze the billing entries to detect patterns of billing independent of patients seen by the healthcare providers. For example, the healthcare providers may bill one or more particular codes at a regular interval (e.g., a healthcare provider may bill a particular code every thirty days regardless of the identities and/or the number of patients seen by the healthcare provider). The healthcare metric may characterize the pattern detected by the billing engine 126.
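As a non-limiting illustration, one simple way to detect a code billed at a regular interval is to check whether the gaps between consecutive billing dates are nearly constant. The function name `bills_at_fixed_interval`, the tolerance parameter, the minimum of three entries, and the sample dates are assumptions made for this sketch.

```python
from datetime import date


def bills_at_fixed_interval(dates, tolerance_days=0):
    """True if consecutive billing dates are separated by a near-constant gap."""
    ordered = sorted(dates)
    if len(ordered) < 3:
        return False  # too few entries to call it a pattern
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    # The pattern is regular when the longest and shortest gaps nearly match.
    return max(gaps) - min(gaps) <= tolerance_days


# Hypothetical billing dates for one code: every thirty days versus irregular.
regular = [date(2017, 1, 1), date(2017, 1, 31), date(2017, 3, 2), date(2017, 4, 1)]
irregular = [date(2017, 1, 1), date(2017, 1, 9), date(2017, 3, 2), date(2017, 3, 5)]
is_regular = bills_at_fixed_interval(regular)
is_irregular = bills_at_fixed_interval(irregular)
```

A nonzero tolerance could accommodate weekends or holidays that shift a billing date by a day or two.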

The billing engine 126 may analyze the billing entries to determine levels and/or periodicity of billing that are independent of external factors. For example, healthcare providers may, on average, experience fluctuations in the number of patients seen/number of billing entries/types of billing entries based on the time of the year and/or weather conditions. The billing engine 126 may analyze the billing entries to determine whether particular healthcare providers do not experience fluctuations in billing experienced by other healthcare providers (e.g., a particular healthcare provider's billing is not affected by changes in weather, temperature, humidity, etc.).

In some embodiments, the billing engine 126 may access information about external factors to identify periods of time when the billings of the healthcare providers may fluctuate. For example, the billing engine 126 may access weather information to determine periods when healthcare providers' billings are expected to decrease. The billing engine 126 may analyze the billing entries to determine which healthcare providers' billing entries stayed level or increased during those periods.

In some embodiments, the billing engine 126 may analyze billing entries to determine when the providers and/or patients are engaging in unlikely/impossible activities. For example, the billing engine 126 may analyze billing entries to determine when a particular provider has billed more than 24 hours' worth of codes during a single day (billing an “impossible day”). As another example, the billing engine 126 may analyze billing entries to determine when a particular patient's multiple visits to a healthcare provider and/or visits to multiple healthcare providers in a set amount of time may be suspect. For example, a particular patient may have visited more healthcare providers/had more visits to a healthcare provider than would be likely during a given time period (e.g., a day). As another example, a particular patient may have visited healthcare providers located far from each other such that the timing of the visits (e.g., visited during the same day) is unlikely. Such unlikely visits by a patient may indicate that one or more healthcare providers are engaging in medical identity theft (using patient information to bill for patients not seen).
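As a non-limiting illustration, detection of an “impossible day” might be sketched as follows. The sketch assumes that each billing entry has already been mapped to a typical duration in minutes; that mapping, the record layout, and the function name `impossible_days` are hypothetical and not specified by the disclosure.

```python
from collections import defaultdict
from datetime import date

MINUTES_PER_DAY = 24 * 60


def impossible_days(entries):
    """Return sorted (provider_id, day) pairs whose billed minutes exceed 24 hours.

    Each entry is a tuple (provider_id, service_date, billed_minutes).
    """
    minutes = defaultdict(int)
    for provider_id, service_date, billed_minutes in entries:
        # Accumulate billed time per provider per calendar day.
        minutes[(provider_id, service_date)] += billed_minutes
    return sorted(key for key, total in minutes.items() if total > MINUTES_PER_DAY)


# Hypothetical entries: dr_a bills 1500 minutes on a single day.
entries = [
    ("dr_a", date(2017, 3, 1), 900),
    ("dr_a", date(2017, 3, 1), 600),
    ("dr_b", date(2017, 3, 1), 480),
]
flagged = impossible_days(entries)
```

A stricter cutoff than 24 hours (for example, a plausible working day) could be substituted to surface unlikely rather than strictly impossible days.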

In some embodiments, one or more healthcare metrics may be determined using the systems/methods/non-transitory computer readable medium as disclosed in application Ser. No. 15/181,712, “FRAUD LEAD DETECTION SYSTEM FOR EFFICIENTLY PROCESSING DATABASE-STORED DATA AND AUTOMATICALLY GENERATING NATURAL LANGUAGE EXPLANATORY INFORMATION OF SYSTEM RESULTS FOR DISPLAY IN INTERACTIVE USER INTERFACES,” filed on Jun. 14, 2016, which is hereby incorporated by reference in its entirety.

In various embodiments, the lead engine 116 may be configured to compare a healthcare metric to a healthcare threshold and generate one or more leads for investigation based on the comparison. A healthcare threshold may include a static value and/or a dynamic value to which the healthcare metric may be compared. A healthcare threshold may be set manually (e.g., by a user) and/or may be set automatically. For example, for different types of healthcare metrics (e.g., determined based on pharmacy events and clinical events, opiate doses, billing pattern, mutual entropy, etc.), a user may manually set the healthcare threshold such that healthcare metrics that meet, exceed, or fall below the healthcare threshold are used to generate leads for investigation. As another example, for different types of healthcare metrics, the lead engine 116 may determine the healthcare threshold based on aggregation of healthcare metrics such that the healthcare threshold represents a certain statistical deviation from the aggregated healthcare metrics.
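As a non-limiting illustration, a dynamic healthcare threshold representing a statistical deviation from aggregated metrics might be sketched as follows. The choice of mean plus k population standard deviations, the value k = 2, the function names, and the sample metric values are assumptions made for this example; the disclosure does not fix a particular statistic.

```python
import statistics


def dynamic_threshold(metrics, k=2.0):
    """Threshold set k population standard deviations above the mean metric."""
    values = list(metrics.values())
    return statistics.mean(values) + k * statistics.pstdev(values)


def leads_above(metrics, threshold):
    """Entities whose metric exceeds the threshold, in sorted order."""
    return sorted(name for name, value in metrics.items() if value > threshold)


# Hypothetical per-entity metrics; entity "f" is far above the rest.
metrics = {"a": 10.0, "b": 12.0, "c": 11.0, "d": 9.0, "e": 13.0, "f": 60.0}
threshold = dynamic_threshold(metrics, k=2.0)
leads = leads_above(metrics, threshold)
```

For metrics where a low value is unsatisfactory, the comparison would be reversed and the threshold placed below the mean. Because an extreme value inflates both the mean and the standard deviation, a robust statistic such as the median could be substituted.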

For example, for healthcare metrics of healthcare providers and/or patients determined based on pharmacy events and clinical events, the healthcare threshold may be determined based on expected numbers of pharmacy events associated with clinical events (e.g., ratio of numbers of pharmacy events to numbers of associated clinical events) and/or based on the duration between the occurrences of pharmacy events and associated clinical events (e.g., does a clinical event occur within a certain duration before and/or after a pharmacy event; how many clinical events occur within a certain duration before and/or after pharmacy events). Based on the healthcare metric of the healthcare providers and/or the patients not satisfying the healthcare threshold, the lead engine 116 may identify as leads for investigation one or more of the corresponding healthcare providers, the patients, and/or the healthcare events (e.g., clinical event, pharmacy event, etc.).

As another example, for healthcare metrics of healthcare providers and/or patients determined based on opiate doses received by patients, the healthcare threshold may be determined based on a certain amount of opiate doses/morphine equivalent. Based on the healthcare metric of the healthcare providers and/or the patients not satisfying the healthcare threshold, the lead engine 116 may identify as leads for investigation one or more of the corresponding healthcare providers, the patients, and/or the healthcare events (e.g., clinical event, pharmacy event, etc.).

As another example, for healthcare metrics of healthcare providers determined based on mutual entropy, the healthcare threshold may be determined based on a value indicating a certain dependence/independence between the healthcare providers' billings and the patients seen by the healthcare providers. Based on the healthcare metric of the healthcare providers not satisfying the healthcare threshold, the lead engine 116 may identify as leads for investigation one or more of the corresponding healthcare providers.

As another example, for healthcare metrics of healthcare providers determined based on billing trends, the healthcare threshold may be determined based on a value indicating a certain trend/pattern of billing. For example, the healthcare threshold may be set based on a maximum amount of billings and/or increase in the use of more expensive billing codes for a set duration of time, based on the level/periodicity of billings that are independent of external factors, and/or based on the number/level of unlikely/impossible activities reflected by the claims. Based on the healthcare metric of the healthcare providers not satisfying the healthcare threshold, the lead engine 116 may identify as leads for investigation one or more of the corresponding healthcare providers.

In some embodiments, the lead engine 116 may use the comparison of healthcare metrics to healthcare thresholds as one among multiple factors for generating leads for investigation. The lead engine 116 may review other factors when a healthcare metric does not satisfy the healthcare threshold. For example, with respect to healthcare metrics of healthcare providers and/or patients determined based on opiate doses received by patients, other factors may include whether a patient has a recorded history of displaying characteristics/behaviors of a drug dependent person (e.g., frequent visits to the emergency room, seeing many different doctors, visiting many different pharmacies, etc.). The lead engine 116 may use a classifier to return a probability, based on the comparison of the healthcare metric to healthcare thresholds and other factors, that a patient is a drug seeker.

As another example, with respect to healthcare metrics of healthcare providers determined based on mutual entropy, other factors may include total billing and/or total volume of services/products provided by the healthcare providers. The lead engine 116 may identify as a lead one or more healthcare providers whose healthcare metric does not satisfy the healthcare threshold and whose total billing/volume is higher than others (e.g., top bills, top volumes).

In some embodiments, the lead engine 116 may use multiple comparisons between healthcare metrics and healthcare thresholds to generate leads for investigation. For example, the lead engine 116 may identify patients whose healthcare metrics (e.g., determined based on amount of opiate doses and/or other factors) indicate that the patients may be drug seekers and identify healthcare providers whose healthcare metrics (e.g., determined based on clinical events and pharmacy events) indicate the healthcare providers may be practicing bad pain management. The overlap between the identified patients and the identified healthcare providers (including healthcare providers potentially practicing bad pain management and seeing potentially drug seeking patients) may be identified as leads for investigation.
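The overlap described above can be sketched as a simple join over visit records. The function name, the (patient, provider) visit representation, and the identifiers are assumptions for illustration only.

```python
def overlap_leads(flagged_patients, flagged_providers, visits):
    """Map each flagged provider to the flagged patients that provider has seen."""
    leads = {}
    for patient, provider in visits:
        if patient in flagged_patients and provider in flagged_providers:
            leads.setdefault(provider, set()).add(patient)
    return leads


# Hypothetical inputs: outputs of two separate metric/threshold comparisons.
flagged_patients = {"p1", "p2"}
flagged_providers = {"drA", "drC"}
visits = [("p1", "drA"), ("p2", "drA"), ("p2", "drB"), ("p3", "drC")]

print(overlap_leads(flagged_patients, flagged_providers, visits))
```

Here only drA is returned, since drB was not flagged and drC saw no flagged patient.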

In various embodiments, the lead engine 116 is configured to generate additional leads for investigation based on previously generated leads. For example, based on a lead identifying healthcare providers providing unnecessary amounts of drugs, the lead engine 116 may generate leads for investigation of patients who see the identified healthcare providers. The patients who were prescribed higher amounts of opiate doses and/or who display characteristics/behaviors of drug seekers may be identified as leads. As another example, a lead identifying a patient as a potential drug seeker may be used to track which healthcare providers have provided prescriptions for the patient. Healthcare providers who see and/or provide prescriptions for more patients identified as potential drug seekers may be identified as leads. Backtracking leads of healthcare providers may enable construction of a network model that provides a view of how tightly connected the healthcare providers are through their patients. For example, the network model may indicate which healthcare providers may be connected in their misuse of the healthcare system and/or may indicate which healthcare providers a drug seeking patient may turn to if one of the identified healthcare providers is shut down.
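A minimal sketch of such a network model follows, assuming prescriptions are available as (patient, provider) pairs: two providers are linked when they share a flagged patient, and the edge weight counts the shared patients. The representation and names are hypothetical, not taken from the disclosure.

```python
from collections import defaultdict
from itertools import combinations


def provider_network(prescriptions, flagged_patients):
    """Weighted edges between providers, counting the flagged patients they share."""
    providers_by_patient = defaultdict(set)
    for patient, provider in prescriptions:
        if patient in flagged_patients:
            providers_by_patient[patient].add(provider)
    edges = defaultdict(int)
    for providers in providers_by_patient.values():
        # Sorting gives each undirected edge a single canonical key.
        for a, b in combinations(sorted(providers), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical prescription records; p3 is not flagged and is ignored.
prescriptions = [
    ("p1", "drA"), ("p1", "drB"),
    ("p2", "drA"), ("p2", "drB"), ("p2", "drC"),
    ("p3", "drC"),
]
network = provider_network(prescriptions, {"p1", "p2"})
print(network)  # {('drA', 'drB'): 2, ('drA', 'drC'): 1, ('drB', 'drC'): 1}
```

Heavier edges suggest more tightly connected providers, and a provider's neighbors indicate where a shared patient might turn next.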

In some embodiments, one or more healthcare thresholds may be determined, and/or one or more leads may be identified and/or reported using the systems/methods/non-transitory computer readable medium as disclosed in application Ser. No. 15/181,712, “FRAUD LEAD DETECTION SYSTEM FOR EFFICIENTLY PROCESSING DATABASE-STORED DATA AND AUTOMATICALLY GENERATING NATURAL LANGUAGE EXPLANATORY INFORMATION OF SYSTEM RESULTS FOR DISPLAY IN INTERACTIVE USER INTERFACES,” filed on Jun. 14, 2016, incorporated supra.

FIG. 2 illustrates an exemplary process 200 for generating leads based on pharmacy events and clinical events. The process 200 may be implemented in various environments, including, for example, the environment of FIG. 1. At block 202, one or more pharmacy events for a patient/healthcare provider may be identified from one or more databases of claims. At block 204, one or more clinical events for a patient/healthcare provider may be identified from one or more databases of claims. At block 206, one or more matches between the pharmacy event(s) and the clinical event(s) may be determined. A match between the pharmacy event(s) and the clinical event(s) may exist when the timing of the pharmacy event(s) and the clinical event(s) indicates that the pharmacy event(s) occurred as a result of the clinical event(s). At block 208, one or more leads may be generated based on the matches between the pharmacy event(s) and the clinical event(s). Lead(s) may be generated when the matches between the pharmacy event(s) and the clinical event(s) indicate less than a desired number/timing of clinical events for pharmacy events (e.g., a patient getting prescriptions filled without seeing a doctor, a doctor writing prescriptions for a patient without seeing the patient, etc.).
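The matching at block 206 might be sketched as follows, assuming a pharmacy event matches when a clinical event occurred on the fill date or within a fixed number of days before it. The 30-day window, the function name, and the dates are illustrative assumptions; the disclosure leaves the timing rule open.

```python
from datetime import date, timedelta


def unmatched_pharmacy_events(pharmacy_dates, clinical_dates, window_days=30):
    """Return pharmacy event dates with no clinical event in the preceding window."""
    window = timedelta(days=window_days)
    unmatched = []
    for fill in pharmacy_dates:
        # Matched if some visit happened on the fill date or up to window_days earlier.
        matched = any(timedelta(0) <= fill - visit <= window for visit in clinical_dates)
        if not matched:
            unmatched.append(fill)
    return unmatched


# Hypothetical patient: one office visit, two prescription fills.
clinical = [date(2016, 1, 5)]
pharmacy = [date(2016, 1, 6), date(2016, 3, 20)]

print(unmatched_pharmacy_events(pharmacy, clinical))  # [datetime.date(2016, 3, 20)]
```

At block 208, the number or fraction of unmatched pharmacy events could serve as the basis for generating a lead.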

FIG. 3 illustrates an exemplary process 300 for generating leads based on opiate doses. The process 300 may be implemented in various environments, including, for example, the environment of FIG. 1. At block 302, the dose amounts for a patient may be determined. At block 304, the morphine equivalent of the dose amounts may be determined. At block 306, one or more drug seeking behaviors/characteristics of the patient may be determined. At block 308, the probability that the patient is a drug seeker may be determined based on the morphine equivalent of the dose amounts and the drug seeking behaviors/characteristics of the patient. At block 310, identities of one or more potentially drug-seeking patients may be used to backtrack to the healthcare providers who provided prescriptions to these patients. At block 312, one or more leads may be generated based on the identified patients and/or identified healthcare providers.
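Blocks 304 and 308 can be illustrated with the sketch below. The conversion factors and the logistic weights are placeholders chosen for illustration only: a real system would take conversion factors from an authoritative clinical reference and would learn the weights from labeled data, and the disclosure does not specify either. All function and variable names are hypothetical.

```python
from math import exp

# Illustrative conversion factors only; not clinical guidance.
MME_FACTORS = {"morphine": 1.0, "hydrocodone": 1.0, "oxycodone": 1.5}


def daily_mme(prescriptions):
    """Sum morphine-equivalent milligrams per day over (drug, mg_per_day) pairs."""
    return sum(mg_per_day * MME_FACTORS[drug] for drug, mg_per_day in prescriptions)


def drug_seeker_probability(mme, er_visits, prescribers, pharmacies):
    """Logistic score with hand-set placeholder weights (stand-in for a trained classifier)."""
    score = -6.0 + 0.03 * mme + 0.4 * er_visits + 0.5 * prescribers + 0.5 * pharmacies
    return 1.0 / (1.0 + exp(-score))


mme = daily_mme([("oxycodone", 60), ("morphine", 30)])
print(mme)  # 120.0

low = drug_seeker_probability(20, er_visits=0, prescribers=1, pharmacies=1)
high = drug_seeker_probability(mme, er_visits=3, prescribers=5, pharmacies=4)
print(low < 0.5 < high)  # True
```

Patients whose probability exceeds a chosen cutoff could then be used at block 310 to backtrack to their prescribers.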

FIG. 4 illustrates an exemplary process 400 for generating leads based on billing patterns. The process 400 may be implemented in various environments, including, for example, the environment of FIG. 1. At block 402, billing patterns of one or more healthcare providers may be determined. At block 404, potentially fraudulent billing patterns may be identified. In some embodiments, identifying potentially fraudulent billing patterns may include the use of mutual entropy 404A to determine the dependence/independence between the healthcare providers' billings and the patients seen by the healthcare provider. In some embodiments, identifying potentially fraudulent billing patterns may include the detection of trends 404B that indicate upcoding. In some embodiments, identifying potentially fraudulent billing patterns may include determining the influence/lack of influence of external factors 404C on billing patterns. In some embodiments, identifying potentially fraudulent billing patterns may include detecting unlikely/impossible activities 404D. At block 406, one or more leads may be generated based on the potentially fraudulent billing patterns.

FIG. 5 illustrates a flowchart of an example method 500, according to various embodiments of the present disclosure. The method 500 may be implemented in various environments including, for example, the environment 100 of FIG. 1. The operations of method 500 presented below are intended to be illustrative. Depending on the implementation, the example method 500 may include additional, fewer, or alternative steps performed in various orders or in parallel. The example method 500 may be implemented in various computing systems or devices including one or more processors.

At block 502, a database of claims may be analyzed. At block 504, a healthcare metric may be determined based on the analyses of the database of claims. At block 506, the healthcare metric may be compared to a healthcare threshold. At block 508, based on the comparison of the healthcare metric to the healthcare threshold, a first lead for investigation may be generated.
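The four blocks of method 500 can be sketched as a single function. The claim record layout, the choice of total billed amount as the metric, and the rule that a metric above the threshold fails it are all assumptions made for illustration.

```python
from collections import defaultdict


def generate_leads(claims, metric_fn, threshold):
    """Group claims by provider, compute a metric, and flag metrics above the threshold."""
    by_provider = defaultdict(list)
    for claim in claims:                      # block 502: analyze the claims
        by_provider[claim["provider"]].append(claim)
    leads = []
    for provider, provider_claims in by_provider.items():
        metric = metric_fn(provider_claims)   # block 504: determine the metric
        if metric > threshold:                # block 506: compare to the threshold
            leads.append((provider, metric))  # block 508: generate a lead
    return sorted(leads)


def total_billed(provider_claims):
    """Hypothetical metric: total billed amount for one provider."""
    return sum(c["amount"] for c in provider_claims)


claims = [
    {"provider": "drA", "amount": 100},
    {"provider": "drA", "amount": 950},
    {"provider": "drB", "amount": 200},
]
print(generate_leads(claims, total_billed, threshold=1000))  # [('drA', 1050)]
```

Passing the metric as a function keeps the comparison step independent of which healthcare metric an embodiment uses.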

Hardware Implementation

The techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include circuitry or digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, server computer systems, portable computer systems, handheld devices, networking devices or any other device or combination of devices that incorporate hard-wired and/or program logic to implement the techniques.

Computing device(s) are generally controlled and coordinated by operating system software, such as iOS, Android, Chrome OS, Windows XP, Windows Vista, Windows 7, Windows 8, Windows Server, Windows CE, Unix, Linux, SunOS, Solaris, Blackberry OS, VxWorks, or other compatible operating systems. In other embodiments, the computing device may be controlled by a proprietary operating system. Conventional operating systems control and schedule computer processes for execution, perform memory management, provide file system, networking, and I/O services, and provide user interface functionality, such as a graphical user interface (“GUI”), among other things.

FIG. 6 is a block diagram that illustrates a computer system 600 upon which any of the embodiments described herein may be implemented. The computer system 600 includes a bus 602 or other communication mechanism for communicating information, and one or more hardware processors 604 coupled with bus 602 for processing information. Hardware processor(s) 604 may be, for example, one or more general purpose microprocessors.

The computer system 600 also includes a main memory 606, such as a random access memory (RAM), cache and/or other dynamic storage devices, coupled to bus 602 for storing information and instructions to be executed by processor 604. Main memory 606 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 604. Such instructions, when stored in storage media accessible to processor 604, render computer system 600 into a special-purpose machine that is customized to perform the operations specified in the instructions.

The computer system 600 further includes a read only memory (ROM) 608 or other static storage device coupled to bus 602 for storing static information and instructions for processor 604. A storage device 610, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to bus 602 for storing information and instructions.

The computer system 600 may be coupled via bus 602 to a display 612, such as a cathode ray tube (CRT) or LCD display (or touch screen), for displaying information to a computer user. An input device 614, including alphanumeric and other keys, is coupled to bus 602 for communicating information and command selections to processor 604. Another type of user input device is cursor control 616, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 604 and for controlling cursor movement on display 612. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. In some embodiments, the same direction information and command selections as cursor control may be implemented via receiving touches on a touch screen without a cursor.

The computing system 600 may include a user interface module to implement a GUI that may be stored in a mass storage device as executable software codes that are executed by the computing device(s). This and other modules may include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.

In general, the word “module,” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, C or C++. A software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts. Software modules configured for execution on computing devices may be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, magnetic disc, or any other tangible medium, or as a digital download (and may be originally stored in a compressed or installable format that requires installation, decompression or decryption prior to execution). Such software code may be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors. The modules or computing device functionality described herein are preferably implemented as software modules, but may be represented in hardware or firmware. Generally, the modules described herein refer to logical modules that may be combined with other modules or divided into sub-modules despite their physical organization or storage.

The computer system 600 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 600 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 600 in response to processor(s) 604 executing one or more sequences of one or more instructions contained in main memory 606. Such instructions may be read into main memory 606 from another storage medium, such as storage device 610. Execution of the sequences of instructions contained in main memory 606 causes processor(s) 604 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.

The term “non-transitory media,” and similar terms, as used herein refers to any media that store data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 610. Volatile media includes dynamic memory, such as main memory 606. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge, and networked versions of the same.

Non-transitory media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between non-transitory media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 602. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 604 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 600 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 602. Bus 602 carries the data to main memory 606, from which processor 604 retrieves and executes the instructions. The instructions received by main memory 606 may optionally be stored on storage device 610 either before or after execution by processor 604.

The computer system 600 also includes a communication interface 618 coupled to bus 602. Communication interface 618 provides a two-way data communication coupling to one or more network links that are connected to one or more local networks. For example, communication interface 618 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 618 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN (or WAN component to communicate with a WAN). Wireless links may also be implemented. In any such implementation, communication interface 618 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

A network link typically provides data communication through one or more networks to other data devices. For example, a network link may provide a connection through local network to a host computer or to data equipment operated by an Internet Service Provider (ISP). The ISP in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet”. Local network and Internet both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link and through communication interface 618, which carry the digital data to and from computer system 600, are example forms of transmission media.

The computer system 600 can send messages and receive data, including program code, through the network(s), network link and communication interface 618. In the Internet example, a server might transmit a requested code for an application program through the Internet, the ISP, the local network and the communication interface 618.

The received code may be executed by processor 604 as it is received, and/or stored in storage device 610, or other non-volatile storage for later execution.

Each of the processes, methods, and algorithms described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computer systems or computer processors comprising computer hardware. The processes and algorithms may be implemented partially or wholly in application-specific circuitry.

The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of this disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed example embodiments. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed example embodiments.

Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.

Any process descriptions, elements, or blocks in the flow diagrams described herein and/or depicted in the attached figures should be understood as potentially representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted, executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art.

It should be emphasized that many variations and modifications may be made to the above-described embodiments, the elements of which are to be understood as being among other acceptable examples. All such modifications and variations are intended to be included herein within the scope of this disclosure. The foregoing description details certain embodiments of the invention. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the invention can be practiced in many ways. As is also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the invention with which that terminology is associated. The scope of the invention should therefore be construed in accordance with the appended claims and any equivalents thereof.

Engines, Components, and Logic

Certain embodiments are described herein as including logic or a number of components, engines, or mechanisms. Engines may constitute either software engines (e.g., code embodied on a machine-readable medium) or hardware engines. A “hardware engine” is a tangible unit capable of performing certain operations and may be configured or arranged in a certain physical manner. In various example embodiments, one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware engines of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware engine that operates to perform certain operations as described herein.

In some embodiments, a hardware engine may be implemented mechanically, electronically, or any suitable combination thereof. For example, a hardware engine may include dedicated circuitry or logic that is permanently configured to perform certain operations. For example, a hardware engine may be a special-purpose processor, such as a Field-Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC). A hardware engine may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations. For example, a hardware engine may include software executed by a general-purpose processor or other programmable processor. Once configured by such software, hardware engines become specific machines (or specific components of a machine) uniquely tailored to perform the configured functions and are no longer general-purpose processors. It will be appreciated that the decision to implement a hardware engine mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the phrase “hardware engine” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented engine” refers to a hardware engine. Considering embodiments in which hardware engines are temporarily configured (e.g., programmed), each of the hardware engines need not be configured or instantiated at any one instance in time. For example, where a hardware engine comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware engines) at different times. Software accordingly configures a particular processor or processors, for example, to constitute a particular hardware engine at one instance of time and to constitute a different hardware engine at a different instance of time.

Hardware engines can provide information to, and receive information from, other hardware engines. Accordingly, the described hardware engines may be regarded as being communicatively coupled. Where multiple hardware engines exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware engines. In embodiments in which multiple hardware engines are configured or instantiated at different times, communications between such hardware engines may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware engines have access. For example, one hardware engine may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware engine may then, at a later time, access the memory device to retrieve and process the stored output. Hardware engines may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).

The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented engines that operate to perform one or more operations or functions described herein. As used herein, “processor-implemented engine” refers to a hardware engine implemented using one or more processors.

Similarly, the methods described herein may be at least partially processor-implemented, with a particular processor or processors being an example of hardware. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented engines. Moreover, the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an Application Program Interface (API)).

The performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processors or processor-implemented engines may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the processors or processor-implemented engines may be distributed across a number of geographic locations.

Language

Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.

Although an overview of the subject matter has been described with reference to specific example embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of embodiments of the present disclosure. Such embodiments of the subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single disclosure or concept if more than one is, in fact, disclosed.

The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

It will be appreciated that an “engine,” “system,” “data store,” and/or“database” may comprise software, hardware, firmware, and/or circuitry.In one example, one or more software programs comprising instructionscapable of being executable by a processor may perform one or more ofthe functions of the engines, data stores, databases, or systemsdescribed herein. In another example, circuitry may perform the same orsimilar functions. Alternative embodiments may comprise more, less, orfunctionally equivalent engines, systems, data stores, or databases, andstill be within the scope of present embodiments. For example, thefunctionality of the various systems, engines, data stores, and/ordatabases may be combined or divided differently.

“Open source” software is defined herein to be source code that allows distribution as source code as well as compiled form, with a well-publicized and indexed means of obtaining the source, optionally with a license that allows modifications and derived works.

The data stores described herein may be any suitable structure (e.g., an active database, a relational database, a self-referential database, a table, a matrix, an array, a flat file, a document-oriented storage system, a non-relational No-SQL system, and the like), and may be cloud-based or otherwise.

As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the example configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.

Although the invention has been described in detail for the purpose of illustration based on what is currently considered to be the most practical and preferred implementations, it is to be understood that such detail is solely for that purpose and that the invention is not limited to the disclosed implementations, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the appended claims. For example, it is to be understood that the present invention contemplates that, to the extent possible, one or more features of any embodiment can be combined with one or more features of any other embodiment.

1. A system comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, in conjunction with a particular machine learning model for a subset of the instructions, cause the system to perform: obtaining data of entities from databases based on a frequency at which the data changes; storing the obtained data in a repository; using the particular machine learning model, detecting misuse among entities, wherein training of the particular machine learning model comprises: obtaining a first training dataset from among known outcomes of previous analyses based on first sources verified to have been associated with misuse; and obtaining a second training dataset from among known outcomes of previous analyses based on second sources verified to have been nonassociated with misuse; and outputting an indication of the detected misuse.
2. The system of claim 1, wherein the obtaining of the first training dataset and the second training dataset is further based on a rate of convergence of the particular machine learning model resulting from training using the first sources and the second sources.
3. The system of claim 1, wherein the first sources are associated with highest rates of convergence of the particular machine learning model compared to other sources verified to have been associated with misuse, and the second sources are associated with highest rates of convergence of the particular machine learning model compared to other sources verified to have been nonassociated with misuse.
4. The system of claim 1, wherein the first sources are associated with highest uncertainties compared to other sources verified to have been associated with misuse, and the second sources are associated with highest uncertainties compared to other sources verified to have been nonassociated with misuse.
5. The system of claim 1, wherein the instructions further cause the system to perform translating indicators of misuse within the first training dataset and the second training dataset into particular metrics, metric values, or weights, wherein the particular metrics, metric values, or weights are used to iteratively train the particular machine learning model.
6. The system of claim 5, wherein the iterative training comprises modifying weights assigned to signals of the particular machine learning model.
7. The system of claim 1, wherein the particular machine learning model comprises a nearest neighbor model.
8. The system of claim 1, wherein the first training dataset and the second training dataset are obtained from a different model.
9. The system of claim 1, wherein the instructions further cause the system to perform obtaining a third training dataset from among previous analyses by selecting previous third sources that were indeterminate regarding an association with misuse.
10. The system of claim 1, wherein the instructions further cause the system to perform appending, to an interface, a natural language explanation of the detected misuse and a correlation between the detected misuse and a previous instance of misuse.
11. A method implemented by a computing system including one or more processors and storage media storing machine-readable instructions, wherein the method is performed using the one or more processors, in conjunction with a particular machine learning model, the method comprising: obtaining data of entities from databases based on a frequency at which the data changes; storing the obtained data in a repository; using the particular machine learning model, detecting misuse among entities, wherein training of the particular machine learning model comprises: obtaining a first training dataset from among known outcomes of previous analyses based on first sources verified to have been associated with misuse; and obtaining a second training dataset from among known outcomes of previous analyses based on second sources verified to have been nonassociated with misuse; and outputting an indication of the detected misuse.
12. The method of claim 11, wherein the obtaining of the first training dataset and the second training dataset is further based on a rate of convergence of the particular machine learning model resulting from training using the first sources and the second sources.
13. The method of claim 11, wherein the first sources are associated with highest rates of convergence of the particular machine learning model compared to other sources verified to have been associated with misuse, and the second sources are associated with highest rates of convergence of the particular machine learning model compared to other sources verified to have been nonassociated with misuse.
14. The method of claim 11, wherein the first sources are associated with highest uncertainties compared to other sources verified to have been associated with misuse, and the second sources are associated with highest uncertainties compared to other sources verified to have been nonassociated with misuse.
15. The method of claim 11, further comprising translating indicators of misuse within the first training dataset and the second training dataset into particular metrics, metric values, or weights, wherein the particular metrics, metric values, or weights are used to iteratively train the particular machine learning model.
16. The method of claim 15, wherein the iterative training comprises modifying weights assigned to signals of the particular machine learning model.
17. The method of claim 11, wherein the particular machine learning model comprises a nearest neighbor model.
18. The method of claim 11, wherein the first training dataset and the second training dataset are obtained from a different model.
19. The method of claim 11, further comprising obtaining a third training dataset from among previous analyses by selecting previous third sources that were indeterminate regarding an association with misuse.
20. The method of claim 11, further comprising appending, to an interface, a natural language explanation of the detected misuse and a correlation between the detected misuse and a previous instance of misuse.
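
By way of non-limiting illustration only, the detecting step recited above, when the particular machine learning model comprises a nearest neighbor model trained on a first dataset verified as associated with misuse and a second dataset verified as not associated with misuse, might be sketched as follows. The feature vectors, entity names, the value of `k`, and the function names `knn_predict` and `detect_misuse` are hypothetical and are not taken from the disclosure; the sketch does not cover the obtaining, storing, or source-selection steps.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort labeled examples by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features: (claims per month, average claim amount in $1000s).
# First training dataset: sources verified as associated with misuse (label 1).
misuse_examples = [((40.0, 9.0), 1), ((35.0, 8.5), 1), ((42.0, 10.0), 1)]
# Second training dataset: sources verified as not associated with misuse (label 0).
non_misuse_examples = [((5.0, 1.0), 0), ((7.0, 1.5), 0), ((6.0, 0.8), 0)]
training_set = misuse_examples + non_misuse_examples

def detect_misuse(entities, train=training_set, k=3):
    """Return the names of entities whose nearest neighbors are mostly misuse cases."""
    return [name for name, features in entities.items()
            if knn_predict(train, features, k) == 1]

# Output an indication of detected misuse for two hypothetical entities.
flagged = detect_misuse({"provider_a": (38.0, 9.2), "provider_b": (6.0, 1.1)})
print(flagged)
```

In this sketch the two verified datasets are simply concatenated into one labeled set; an actual embodiment may select, weight, or scale the training examples differently.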